

#### **Microkernel Construction** I.10 – Local IPC (Optimization for Multi-Threaded Applications)

Lecture Summer Term 2017 Wednesday 15:45-17:15 R 131, 50.34 (INFO)

#### Jens Kehne | Marius Hillenbrand Operating Systems Group, Department of Computer Science





## Tutors Wanted for Operating Systems



### What?

- Supervising a tutorial
- Correcting submissions (no grading!)
- Student assistant (HiWi) job, 40 hours each from October to February

### Why?

- Practice presentation skills!
- Access to tutor training (SQ, 4 ECTS)

### Interested?

- Mathias Gottschlag (R161)
- mathias.gottschlag@kit.edu



# Synchronization via IPC





#### **AS-local IPC in Practice**










05.07.2017 Jens Kehne | Marius Hillenbrand – Microkernel Construction, SS 2017

- Synchronization
- Load Distribution

# Observations

- IPC operations stay within the same address space
- IPC operations have both a blocking send phase and a blocking receive phase

# Introduce Special Local IPC (LIPC)

- Restrictions
  - Same address space
  - Must have both a blocking send and a blocking receive phase
- Can execute entirely at user level
- LIPC executes in ~20 cycles!

#### **User-Level Threads?**



## Would achieve required speed

- But ...
  - Not known to the kernel
  - Execute in a single thread's context
  - Making them kernel-schedulable does not pay (SDI: scheduler activations)
  - Having two thread concepts is inelegant and contradicts minimality

## We want ...

- Kernel-level threads
- The speed of user-level threads





























#### **Basic Idea**



- Assume IPC $t_1 \rightarrow t_2$, same address space
- Let $t_1$ execute $t_2$'s code
- Postpone the real switch until the kernel is activated
- Pays off if multiple lazy switches occur before the first kernel activation, e.g.:
  - $t_1 \rightarrow t_2$, work, $t_2 \rightarrow t_1$
    - Costs 0 kernel-level IPCs
  - client $\rightarrow t_1 \rightarrow t_2 \rightarrow$ client
    - Costs 2 kernel-level IPCs
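
The two examples above can be checked with a small counting model. This is only a sketch: it assumes that every IPC between threads of the same address space is handled lazily at user level and that every IPC crossing address spaces enters the kernel. The names `count_kernel_ipcs` and `space_of`, and the address-space labels, are hypothetical.

```python
def count_kernel_ipcs(path, space_of):
    """Count the hops of an IPC path that need a kernel-level IPC.

    Assumption: a hop stays at user level (lazy switch) exactly when
    sender and receiver live in the same address space.
    """
    kernel_ipcs = 0
    for sender, receiver in zip(path, path[1:]):
        if space_of[sender] != space_of[receiver]:
            kernel_ipcs += 1
    return kernel_ipcs


# Hypothetical setup: t1 and t2 share one server address space.
space_of = {"client": "client_as", "t1": "server_as", "t2": "server_as"}

local_round_trip = count_kernel_ipcs(["t1", "t2", "t1"], space_of)
client_request = count_kernel_ipcs(["client", "t1", "t2", "client"], space_of)
```

With this setup, `local_round_trip` is 0 and `client_request` is 2, matching the costs listed above.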





































#### **IPC Revisited**

 $A \rightarrow B$ : SendAndWaitForReply in user-mode call IPC function, i.e. push A's instruction pointer ; save A's stack pointer ;

if B is valid thread ID and thread B waits for thread A then

set A's status to "wait for B"; set B's status to "run"; load B's stack pointer; current thread := B;

return, i.e. pop B's instruction pointer else

more complicated IPC handling

fi.
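
The pseudocode above can be modelled in a few lines of Python. This is a sketch for illustration, not the kernel's actual code: the `Thread` class, the per-thread `stack` list (which stands in for saving and loading stack pointers), and the `"slow_path"` marker for the complicated case are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Thread:
    tid: int
    status: str = "run"                 # "run" or "wait"
    waiting_for: Optional[int] = None   # partner tid while status == "wait"
    stack: List[str] = field(default_factory=list)  # saved instruction pointers


def send_and_wait_for_reply(threads: Dict[int, Thread], current: int,
                            dest: int, return_ip: str):
    """Model of A -> B; returns (new current thread, ip to continue at)."""
    a = threads[current]
    a.stack.append(return_ip)           # push A's instruction pointer
    b = threads.get(dest)               # None models an invalid thread ID
    if b is not None and b.status == "wait" and b.waiting_for == a.tid:
        a.status, a.waiting_for = "wait", b.tid   # A now waits for B
        b.status, b.waiting_for = "run", None     # B becomes runnable
        return b.tid, b.stack.pop()     # switch to B's stack, pop B's ip
    a.stack.pop()                       # modelling choice: undo the push
    return current, "slow_path"         # stands in for the complicated case


threads = {1: Thread(1),
           2: Thread(2, status="wait", waiting_for=1, stack=["b_resume"])}
new_current, resume_ip = send_and_wait_for_reply(threads, 1, 2, "a_resume")
fallback = send_and_wait_for_reply(threads, 2, 99, "x")
```

After the first call, thread 2 is current and continues at `"b_resume"`, while thread 1 is marked as waiting for thread 2. The second call targets an unknown thread ID and falls back to the slow path.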



- Atomicity?
- Kernel data?

### Atomicity

$A \rightarrow B$: SendAndWaitForReply in user mode

call IPC function, i.e. push A's instruction pointer ;

save A's stack pointer ;

- restart point -

if B is valid thread ID and thread B waits for thread A

then

- forward point -

set A's status to "wait for B" ;

set B's status to "run" ;

load B's stack pointer ;

current thread := B ;

- completion point -

return, i.e. pop B's instruction pointer

else

more complicated IPC handling

fi.







## Atomicity (2)



Interruption between forward point and completion point:

```
if interruption is a page fault
then
    kill thread A
else
    set A's status to "wait for B" ;
    set B's status to "run" ;
    load B's stack pointer ;
    current thread := B ;
    set interrupted instruction pointer to completion point
fi.
```
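
A kernel-side handler for this case might look like the following sketch. The code addresses `RESTART`, `FORWARD`, and `COMPLETION` are hypothetical, as are the dictionary layouts. Resetting to the restart point for an interruption before the forward point is an assumption; the slide only specifies the case between forward and completion point.

```python
RESTART, FORWARD, COMPLETION = 0x100, 0x110, 0x130  # hypothetical code addresses


def fix_interrupted_lipc(ip, is_page_fault, a, b, state):
    """Return the instruction pointer to resume at, or None if A was killed.

    a and b are dicts with "status" and "sp"; state holds "current" and "sp".
    """
    if RESTART <= ip < FORWARD:
        # Assumption: nothing has been modified yet, so simply start over.
        return RESTART
    if FORWARD <= ip < COMPLETION:
        if is_page_fault:
            a["status"] = "killed"
            return None
        # Roll forward: redo the whole sequence; every step is idempotent.
        a["status"] = "wait for B"
        b["status"] = "run"
        state["sp"] = b["sp"]
        state["current"] = "B"
        return COMPLETION
    return ip  # interrupted outside the LIPC sequence: nothing to fix


a = {"status": "run", "sp": 0x7000}
b = {"status": "wait for A", "sp": 0x8000}
state = {"current": "A", "sp": 0x7000}
resume_at = fix_interrupted_lipc(0x118, False, a, b, state)
```

Rolling forward works here because each assignment in the sequence can be repeated safely, so the kernel does not need to know how far the user-level code had progressed.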

#### **Kernel Data**



- A's TCB: stack pointer, status
- B's TCB: stack pointer, status
- current thread

## Stack pointer

- Can be user accessible

## Status

- User-level effects
  - Effects local to A's task can be ignored
  - Indirect effects on other tasks can be ignored
- System-level effects
  - Must be avoided
  - Validate values on change
  - Maintain twin variable in kernel

#### UTCB – KTCB










## **Current\_thread Inconsistency**



```
if CurrentKTCB.utcb != CurrentUTCB
then
    /* Inconsistency found - check validity of user-level thread switch. */
    NewKTCB := getKTCB(CurrentUTCB.myself) ;
    if NewKTCB.myself = CurrentUTCB.myself and
       NewKTCB.space = CurrentKTCB.space and
       NewKTCB.utcb = CurrentUTCB
    then
        /* Valid user-level switch to valid thread in same address space. */
        update kernel state ;
        CurrentKTCB := NewKTCB ;
    else
        kill thread(CurrentKTCB)
    fi
fi.
```
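
The consistency check can be exercised with a small Python model. This is a sketch under assumptions: the `UTCB` and `KTCB` classes carry only the fields named in the pseudocode, the `ktcbs` dictionary stands in for `getKTCB`, and the "update kernel state" step is left out.

```python
from dataclasses import dataclass


@dataclass
class UTCB:
    myself: int            # thread ID, writable by user code


@dataclass
class KTCB:
    myself: int
    space: str
    utcb: UTCB
    alive: bool = True


def fix_current_thread(current_ktcb, current_utcb, ktcbs):
    """Return the KTCB the kernel should treat as current (None if killed)."""
    if current_ktcb.utcb is current_utcb:
        return current_ktcb                       # consistent, nothing to do
    new_ktcb = ktcbs.get(current_utcb.myself)     # models getKTCB()
    if (new_ktcb is not None
            and new_ktcb.myself == current_utcb.myself
            and new_ktcb.space == current_ktcb.space
            and new_ktcb.utcb is current_utcb):
        return new_ktcb                           # valid user-level switch
    current_ktcb.alive = False                    # kill thread(CurrentKTCB)
    return None


utcb_a, utcb_b = UTCB(1), UTCB(2)
ktcb_a, ktcb_b = KTCB(1, "as0", utcb_a), KTCB(2, "as0", utcb_b)
ktcbs = {1: ktcb_a, 2: ktcb_b}

switched = fix_current_thread(ktcb_a, utcb_b, ktcbs)    # lazy switch A -> B
forged = fix_current_thread(ktcb_a, UTCB(2), ktcbs)     # forged UTCB copy
```

The first call accepts the user-level switch and returns B's KTCB. The second presents a UTCB object that claims B's ID but is not the UTCB registered in B's KTCB, so the check fails and the current thread is killed.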



[Figure: thread states (RUNNING, WAITING) of threads A and B, and esp0]










#### Kernel State Fixup – $A \rightarrow B$






## **LIPC Chains**

[Figure: LIPC chain; labels include esp0, B's USP, and C's USP]








### What About Priorities?






# Safety & Security



- Threads can only destroy their own task.
  - Possible even without lazy switching.
- Threads can only cheat about their identity within their own task.
  - Affects only own task.
- Threads cannot modify their effective priority, uid, etc.

## **IPC Performance Promise – May 2001**








## **IPC Performance – Prototype**

- LIPC: 23 cycles
  - 1/15th of regular IPC (no sysops, no fastpath)
- Overhead on IPC due to LIPC extensions
  - 43 cycles intra-AS IPC
  - 146 cycles inter-AS IPC
    - UTCB synchronization
- Too much for real-world systems: P3 inter-AS IPC was only 236 cycles w/o LIPC support!
- Overhead due to kernel fixup: ???
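
A back-of-the-envelope reading of these figures is shown below. The 345-cycle value is only what the "1/15th" ratio implies for the prototype's regular IPC path; it is not stated on the slide. The percentage compares the prototype's inter-AS overhead with the quoted P3 figure, as the slide itself does.

$$
23 \text{ cycles} \times 15 \approx 345 \text{ cycles}
$$

$$
\frac{146 \text{ cycles}}{236 \text{ cycles}} \approx 0.62
$$

Under these assumptions, the LIPC extensions would add roughly 62% to an inter-AS IPC that otherwise takes 236 cycles.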

# Limitations of LIPC



- Intra-address-space only
- Register-only IPC, no map/grant/string
- Always both send and receive phase
- Infinite receive timeout
- Tricky: change from Wait\_for\_X to Wait\_for\_Any
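
The restrictions listed above amount to a simple eligibility test that decides whether an IPC may take the LIPC path. The sketch below is illustrative only; the `IpcRequest` class and its field names are hypothetical and do not correspond to any real L4 data structure.

```python
from dataclasses import dataclass


@dataclass
class IpcRequest:
    sender_space: str
    receiver_space: str
    has_send_phase: bool
    has_receive_phase: bool
    uses_map_grant_or_string: bool
    receive_timeout: float          # float("inf") means "wait forever"


def lipc_eligible(req):
    """True if the request satisfies all LIPC restrictions listed above."""
    return (req.sender_space == req.receiver_space      # intra-AS only
            and req.has_send_phase and req.has_receive_phase
            and not req.uses_map_grant_or_string        # register-only
            and req.receive_timeout == float("inf"))    # infinite timeout


local_call = IpcRequest("as0", "as0", True, True, False, float("inf"))
cross_space = IpcRequest("as0", "as1", True, True, False, float("inf"))
send_only = IpcRequest("as0", "as0", True, False, False, float("inf"))
```

Any request that fails the test would have to use the regular kernel IPC path instead.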